Skip to content

[Codex] Add handling for Conversational RAG to Validator API#84

Closed
ulya-tkch wants to merge 6 commits intomainfrom
uly-add-conversational
Closed

[Codex] Add handling for Conversational RAG to Validator API#84
ulya-tkch wants to merge 6 commits intomainfrom
uly-add-conversational

Conversation

@ulya-tkch
Copy link
Contributor

No description provided.

@ulya-tkch ulya-tkch requested a review from elisno May 30, 2025 22:06
assert all(scores[k]["score"] == raw_scores[k]["score"] for k in raw_scores)


def test_prompt_tlm_with_message_history() -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add test to confirm there is no query rewriting happening, whenever this is the first user message

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add test to confirm that the primary TrustworthyRAG.score(prompt, response) call happens with prompt reflecting the full chat history, not with prompt reflecting the rewritten query.

Confirm you are using this TLM utils method:
cleanlab/cleanlab-tlm@a479e32

to turn the chat history into a prompt string.

return "other_issues"


def validate_messages(messages: Optional[list[dict[str, Any]]] = None) -> None:
Copy link
Contributor

@elisno elisno Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this name validate_messages should be more carefully chosen when the entire validator module reserves the name method validate in Validator for looking at the trustworthiness & Eval scores.

I'd bet we wouldn't change the Validator.validate api, but we could find a different name for validate_messages since it behaves quite differently.

Copy link
Contributor

@elisno elisno Jun 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider having validate_messages take messages as a required (positional argument):

Suggested change
def validate_messages(messages: Optional[list[dict[str, Any]]] = None) -> None:
def validate_messages(messages: list[dict[str, Any]]) -> None:

Everywhere it's being called, it takes in a messages argument.
The caller already sets a default value for that argument, so I'd advise against setting default values in two function signatures.

codex_answer, _ = self._project.query(question=query, metadata=metadata)
return codex_answer

def _maybe_rewrite_query(self, *, query: str, messages: list[dict[str, Any]]) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This _maybe... prefix implies that we might get something different from the method, other than a string. Should the check for self._tlm be done by the caller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the maybe is supposed to suggest we might re-write the query or not

@ulya-tkch
Copy link
Contributor Author

closed because conversational capability moved to the backend

@ulya-tkch ulya-tkch closed this Jun 9, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants